The kappa statistic in reliability studies: use, interpretation, and sample size requirements.
Abstract
PURPOSE: This article examines and illustrates the use and interpretation of the kappa statistic in musculoskeletal research.
SUMMARY OF KEY POINTS: The reliability of clinicians' ratings is an important consideration in areas such as diagnosis and the interpretation of examination findings. Often, these ratings lie on a nominal or an ordinal scale. For such data, the kappa coefficient is an appropriate measure of reliability. Kappa is defined in both weighted and unweighted forms, and its use is illustrated with examples from musculoskeletal research. Factors that can influence the magnitude of kappa (prevalence, bias, and non-independent ratings) are discussed, and ways of evaluating the magnitude of an obtained kappa are considered. The statistical testing of kappa is also considered, including the use of confidence intervals, and appropriate sample sizes for reliability studies using kappa are tabulated.
CONCLUSIONS: The article concludes with recommendations for the use and interpretation of kappa.
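As an illustration of the weighted and unweighted forms mentioned in the abstract, the following is a minimal sketch of Cohen's kappa for two raters, computed from a square contingency table. The function name `cohen_kappa`, its `weights` argument, and the example counts are assumptions made for this illustration and are not taken from the article. The linear and quadratic agreement weights shown are the commonly used conventions, and the sketch assumes at least two categories and expected agreement below 1.

```python
def cohen_kappa(table, weights=None):
    """Cohen's kappa for two raters (illustrative sketch).

    table[i][j] is the number of subjects placed in category i by
    rater A and in category j by rater B (square, k >= 2 categories).
    weights: None for unweighted kappa, or "linear" / "quadratic"
    for weighted kappa on an ordinal scale.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def agreement_weight(i, j):
        # Unweighted: full credit on the diagonal only.
        if weights is None:
            return 1.0 if i == j else 0.0
        # Weighted: partial credit shrinks with category distance.
        d = abs(i - j) / (k - 1)
        return 1.0 - d if weights == "linear" else 1.0 - d * d

    # Observed (weighted) proportion of agreement.
    p_o = sum(agreement_weight(i, j) * table[i][j]
              for i in range(k) for j in range(k)) / n
    # Agreement expected by chance, from the marginal totals.
    p_e = sum(agreement_weight(i, j) * row_tot[i] * col_tot[j]
              for i in range(k) for j in range(k)) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical 2x2 table: 50 subjects rated positive/negative by two raters.
example = [[20, 5],
           [10, 15]]
kappa_unweighted = cohen_kappa(example)
```

For this hypothetical table the observed agreement is 0.70 and the chance-expected agreement is 0.50, so kappa is about 0.40. With only two categories, the weighted and unweighted forms coincide; the weights matter only when an ordinal scale has three or more categories.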
Similar resources
Use of classification tree methods to study the habitat requirements of tench (Tinca tinca) (L., 1758)
Classification trees (J48) were induced to predict the habitat requirements of tench (Tinca tinca). 306 datasets were used for the given fish during 8 years in the river basins in Flanders (Belgium). The input variables consisted of the structural-habitat (width, depth, gradient slope and distance from the source) and physicochemical (pH, dissolved oxygen, water temperature and electric conduct...
Interrater reliability: the kappa statistic
The kappa statistic is frequently used to test interrater reliability. The importance of rater reliability lies in the fact that it represents the extent to which the data collected in the study are correct representations of the variables measured. Measurement of the extent to which data collectors (raters) assign the same score to the same variable is called interrater reliability. While ther...
Sample size requirements for interval estimation of the kappa statistic for interobserver agreement studies with a binary outcome and multiple raters.
Sample size requirements that achieve a prespecified expected lower limit for a confidence interval about the intraclass kappa statistic are supplied for the case of multiple raters and a binary outcome variable. The expected lower confidence limit achievable for a given number of subjects and raters is also presented. These results should be useful in the planning stages of an interobserver ag...
Interval estimation for a difference between intraclass kappa statistics.
Model-based inference procedures for the kappa statistic have developed rapidly over the last decade. However, no method has yet been developed for constructing a confidence interval about a difference between independent kappa statistics that is valid in samples of small to moderate size. In this article, we propose and evaluate two such methods based on an idea proposed by Newcombe (1998, Sta...
Sample optimality in the design of stated choice experiments Submitted
Recent research by Bliemer and Rose (2005, 2009, in press) and Rose and Bliemer (2005) suggests the S-error statistic as a measure for calculating sample size requirements for models estimated using stated choice data. Prior to this, existing sampling theories failed to adequately address the issue of sample size requirements specifically for this type of data and hence researchers have had to ...
Journal title:
- Physical therapy
Volume 85, Issue 3
Pages: -
Publication date: 2005